This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.
A simulation of an interactive topic prevalence+trend figure for the case-level visualization
---
title: "ACDHS Dashboard Demo"
author: "Chapin Hall"
date: "August 6, 2018"
output:
flexdashboard::flex_dashboard:
storyboard: true
social: menu
source: embed
vertical_layout: fill
logo: img/AGC-triangle-small.png
css: flexdash.css
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warnings = FALSE, error = FALSE)
### Load and--if necessary--install packages ----------------------------------#
package.list <- c("flexdashboard", "data.table", "dplyr", "stringr", "rgdal",
"MASS", "plotly", "leaflet", "ggplot2", "scales", "DT")
for (p in package.list){
if (!p %in% installed.packages()[, "Package"]) install.packages(p)
library(p, character.only = TRUE)
}
### Load custom functions -----------------------------------------------------#
grepv <- function(p, x, ...) grep(p, x, value = TRUE, ...)
cn <- function(x) colnames(x)
```
### Dashboard Demonstration
This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.
- This was made with the free and open-source [R programming language](https://www.r-project.org/about.html)
- Code-driven dashboards like this can be automatically generated to be custom for any given user, and updated with new inputs, analysis, and visualization as part of the full analysis pipeline
- The code and data used to create this example can be found—and reused and contributed to—at [this open-source GitHub repository](https://github.com/chapinhall/acdhs-demo)
- **Note** that all data here is fake, intended only for the purpose of illustration
### Case-Level Visualization {data-commentary-width=500}
***
- This figure combines
- text-mined sentiment score
- text-mined key topics of relevance
- indication of institutional contacts, including DHS investigation and intervention
- Low sentiment scores could be used to trigger specific interventions or referrals to other practitioners, when combined with key topic combinations
- Topic and sentiment scores can help supervisors quickly review timely characteristics of individual cases or a worker's entire caseload
- Persistence of key topics can raise flags to QA teams to complete additional modules, e.g. related to housing concerns
### Interactive Topic Visualization
```{r place interactive topic viz}
load("img/interactive_topic_viz.Rda")
topic_viz
```
***
A simulation of an interactive topic prevalence+trend figure for the case-level visualization
### System-Level Trajectory Visualization {data-commentary-width=500}
```{r generate prevalence figure to match trajectory viz}
df <-
data.frame(clus = c("Cluster 1", "Cluster 2", "Cluster 3"),
freq = c(5442, 679, 227)) %>%
mutate(pct = freq/sum(freq),
label = paste0("n = ", prettyNum(freq, big.mark = ",")))
myplot <-
ggplot(df, aes(x = clus, y = pct, fill = clus)) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(label = label), vjust = 1) +
scale_y_continuous(labels = percent) +
labs(title = "Prevalence of Cluster Types in Population",
x = "", y = "% of Population") +
theme(legend.position = "none",
axis.title = element_text(size = 12),
axis.text = element_text(size = 11))
ggsave("img/dually-involved-youth_traj-viz_3-cluster--prevalence.png")
```
***
- Trajectory visualizations can show pathways of cases through institutional contacts
- Each individual trajectory is shown as a (narrow) horizontal line. Each subplot shows hundreds of individual cases, with cases sorted by their starting state, then the type and duration of the state that follows.
- These plots show the starting age, duration, sequence, and number of spells for different institutional contacts
- The multiple figures show different distinct cluster patterns. In this example:
- "Cluster 1" represents individuals with short involvements which start at a range of ages
- "Cluster 2" represents individuals with long, unbroken stints in child welfare cases. A third of this group has out-of-home placements, almost always at early ages.
- "Cluster 3"--the smallest group--represents individuals with long stints of out-of home placements, generally starting at early ages.
- These institutional involvement cluster types can also be displayed in tables or maps
### Macro-Planning
```{r generate macro planning data}
mpn <- 500
Draw1to100 <- function(n, pwr) round(runif(n)^pwr*100, 0)
planets <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Neptune", "Uranus")
mp <- data.frame(ID = str_pad(round(runif(mpn)*1e7, 0), width = 7, side = "left", pad = "0"),
AQ = paste0(sample(c(2017, 2018), mpn, replace = TRUE),
"-Q",
sample(1:4, mpn, replace = TRUE)),
Region = factor(sample(planets, mpn, replace = TRUE), levels = planets),
Sentiment = Draw1to100(mpn, 1.0),
MH = Draw1to100(mpn, 2.0),
DA = Draw1to100(mpn, 2.0),
DV = Draw1to100(mpn, 2.0),
H = Draw1to100(mpn, 2.0),
E = Draw1to100(mpn, 0.5),
M = Draw1to100(mpn, 1.0),
SC = Draw1to100(mpn, 0.5)) %>%
filter(!AQ %in% c("2018-Q3", "2018-Q4"))
```
```{r place and format dynamic table}
# Develop color scale for quantiles of all top
# /!\ An alternative scheme might just be for 1-10, 11-20, etc as hard divisions
topics <- c("MH", "DA", "DV", "H", "E", "M", "SC")
topic_brks <- quantile(select(mp, one_of(topics)), probs = seq(.05, .95, .05), na.rm = TRUE)
# Develop colors as increasing shades of blue
topic_clrs <-
round(seq(255, 40, length.out = length(topic_brks) + 1), 0) %>%
paste0("rgb(", ., ",", ., ",255)")
topicNames <- c("Mental Health", "Drug/Alcohol", "Domestic Violence", "Housing",
"Education", "Medical", "Social Connection")
sent_brks <- quantile(mp$Sentiment, probs = seq(.05, .95, .05), na.rm = TRUE)
sent_clrs <- colorRampPalette(c("firebrick2", "forestgreen"))(length(sent_brks) + 1)
colnames(mp) <- c("ID", "Assessment Quarter", "Region", "Sentiment", topicNames)
rownames(mp) <- NULL
datatable(mp,
extensions = 'Buttons',
filter = 'top',
rownames = FALSE,
options = list(dom = 'Bfrtip',
buttons = list('copy', 'print', list(
extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download')),
pageLength = 20)) %>%
formatStyle("Sentiment", backgroundColor = styleInterval(sent_brks, sent_clrs), color = "white") %>%
formatStyle(topicNames, backgroundColor = styleInterval(topic_brks, topic_clrs))
```
***
### System-Level Topic Map
```{r display zipcode map with topic layers}
# Get zipcode boundary data
zips <- readOGR(dsn = "data", layer = "Allegheny_County_Zip_Code_Boundaries", verbose = FALSE)
# Sample topics:
topics <- c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income")
sigma <- matrix(c(1.0, 0.2, 0.1, 0.2, 0.4,
0.2, 1.0, 0.3, 0.4, 0.5,
0.1, 0.3, 1.0, 0.4, 0.6,
0.2, 0.4, 0.4, 1.0, 0.4,
0.4, 0.5, 0.6, 0.4, 2.0), nrow = 5)
mu <- c(0.8, 1.5, 0.0, -1.0, 1.0)
# Draw correlated values
set.seed(8062018)
draws <- mvrnorm(n = nrow(zips@data), mu = mu, Sigma = sigma)
# Put this on a 0-100 scale, label, and arrange in order (to give a pattern to matching with zips)
draws <- round((draws - min(draws))/(max(draws) - min(draws))*100, 1)
colnames(draws) <- topics
draws <- arrange(as.data.frame(draws), -`Low Income`)
# Attach values to zipcodes
zips@data <-
cbind(zips@data, draws) %>%
within({
topicPop <- paste0("Topic Scores for Zip ", ZIP, ":
",
"Pregnant: ", sprintf("%1.1f", Pregnant), "
",
"Domestic Violence: ", sprintf("%1.1f", `Domestic Violence`), "
",
"Opiates: ", sprintf("%1.1f", Opiates), "
",
"Suicide: ", sprintf("%1.1f", Suicide), "
",
"Low Income: ", sprintf("%1.1f", `Low Income`), "
",
"More Census data on zip code ", ZIP, "")
geo_label <- paste0(str_to_title(NAME), ", zip: ", ZIP)
})
palBin_blue <- colorBin(palette = "Blues", domain = c(0, 100), bins = 5)
myOpacity <- 0.7
leaflet(data = zips) %>% # width = "100%"
#addTiles() %>%
addProviderTiles("CartoDB.Positron") %>%
addPolygons(group = "Pregnant", fillColor = ~palBin_blue(Pregnant),
popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity,
label=~geo_label, labelOptions= labelOptions(direction = 'auto'),
highlightOptions = highlightOptions(
color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1,
bringToFront = TRUE, sendToBack = TRUE)) %>%
addPolygons(group = "Domestic Violence", fillColor = ~palBin_blue(`Domestic Violence`),
popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity,
label=~geo_label, labelOptions= labelOptions(direction = 'auto'),
highlightOptions = highlightOptions(
color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1,
bringToFront = TRUE, sendToBack = TRUE)) %>%
addPolygons(group = "Opiates", fillColor = ~palBin_blue(Opiates),
popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity,
label=~geo_label, labelOptions= labelOptions(direction = 'auto'),
highlightOptions = highlightOptions(
color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1,
bringToFront = TRUE, sendToBack = TRUE)) %>%
addPolygons(group = "Suicide", fillColor = ~palBin_blue(Suicide),
popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity,
label=~geo_label, labelOptions= labelOptions(direction = 'auto'),
highlightOptions = highlightOptions(
color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1,
bringToFront = TRUE, sendToBack = TRUE)) %>%
addPolygons(group = "Low Income", fillColor = ~palBin_blue(`Low Income`),
popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity,
label=~geo_label, labelOptions= labelOptions(direction = 'auto'),
highlightOptions = highlightOptions(
color='#ff0000', opacity = 1, weight = 2, fillOpacity = 1,
bringToFront = TRUE, sendToBack = TRUE)) %>%
addLayersControl(baseGroups = c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income"),
options = layersControlOptions(collapsed = FALSE)) %>%
addLegend(position = "bottomleft", colors = unique(palBin_blue(0:100)),
title = "Topic Scores",
labels = c("0-20", "20-40", "40-60", "60-80", "80-100"))
```
***
* Dynamic maps offer panning, zooming, and "popup" annotations that allow exploration of geographic trends in various case topics and combinations thereof, such as "opiod" and "overdose"
* This map shows an example of multiple topic layers for Alleghency county zip codes, with more information available through clicking